Abstract

In this correspondence, we describe the achievable rate region for reliably recovering
deterministic functions of correlated sources which have a finite alphabet. The method of
proof is almost the same as that used to prove the Slepian-Wolf theorem.

1 Introduction 

Consider the problem of recovering a function $F(X,Y)$ of two correlated sources $(X,Y)$ by
encoding the sources separately (see Fig. 1). A problem of this class was first considered in
[1], where the exact rate region for the modulo-two adder source network was derived. In [2],
necessary and sufficient conditions were derived for the achievable rate region for recovering
functions of correlated sources to coincide with the Slepian-Wolf region [3].

In this correspondence, we describe the achievable rate region for reliably recovering
deterministic functions of correlated sources which have a finite alphabet. The method of
proof is almost the same as that used to prove the Slepian-Wolf theorem.

2 System Model 

The system model is essentially the same as the one described in 1^. We repeat it here for 
convenience and notational clarity. 

Let $X$ and $Y$ be a pair of correlated random variables defined on finite sample spaces
$\mathcal{X}$ and $\mathcal{Y}$, respectively. Denote their joint probability distribution by

\[ p_{XY}(x,y) = \Pr[X = x, Y = y], \quad x \in \mathcal{X},\ y \in \mathcal{Y}. \tag{1} \]

Conforming with the usual convention, we will use uppercase letters to denote random variables 
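and lowercase letters to denote fixed values the random variables may take. Let
$(\mathbf{X},\mathbf{Y}) = (X^n,Y^n) = ((X_1,Y_1),(X_2,Y_2),\ldots,(X_n,Y_n))$ be a sequence of
$n$ independent realizations of the pair of random variables $(X,Y)$. The distribution of
$(\mathbf{X},\mathbf{Y})$ is given by

\[ p_{\mathbf{X}\mathbf{Y}}(\mathbf{x},\mathbf{y}) = \Pr[\mathbf{X} = \mathbf{x}, \mathbf{Y} = \mathbf{y}] = \prod_{i=1}^{n} p_{XY}(x_i,y_i), \quad \mathbf{x} \in \mathcal{X}^n,\ \mathbf{y} \in \mathcal{Y}^n. \tag{2} \]

The number of coordinates in $(\mathbf{X},\mathbf{Y})$ or $(\mathbf{x},\mathbf{y})$ will be clear from context.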
Figure 1: Illustration of the system model.



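Let $F : \mathcal{X} \times \mathcal{Y} \to \mathcal{Z}$ be an arbitrary deterministic function, where
$\mathcal{Z}$ denotes the range of $F$. We will denote the sequence
$(F(X_1,Y_1),F(X_2,Y_2),\ldots,F(X_n,Y_n))$ by $F(\mathbf{X},\mathbf{Y})$. We will sometimes find it
convenient to denote the random variable $F(X,Y)$ by $Z$. Then $\mathbf{Z} = Z^n = F(\mathbf{X},\mathbf{Y})$.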

The sequence $(X_1, X_2, \ldots)$ is available at node A and the sequence $(Y_1, Y_2, \ldots)$ is
available at node B. We wish to reliably recover the sequence $(Z_1, Z_2, \ldots)$ at node C, under
the condition that there is no communication between nodes A and B. This situation is
illustrated in Fig. 1. The channels from node A to node C and from node B to node C are assumed
to be noiseless. So we have a distributed source coding problem where the goal is to
simultaneously minimize the required rates $R_1$ and $R_2$, which allow reliable recovery of the
sequence $(Z_1, Z_2, \ldots)$ at node C.

We now present some definitions similar to ones presented in [4, Section 14.4].

Definition: A distributed source code $\mathcal{C}_n(F)$ for the random variable $F(X,Y)$ is a
triplet of functions $(f_1, f_2, g)$,

\[
\begin{aligned}
f_1 &: \mathcal{X}^n \to \{1,2,\ldots,2^{nR_1}\}, \\
f_2 &: \mathcal{Y}^n \to \{1,2,\ldots,2^{nR_2}\}, \\
g &: \{1,2,\ldots,2^{nR_1}\} \times \{1,2,\ldots,2^{nR_2}\} \to \mathcal{Z}^n,
\end{aligned}
\]

where $f_1, f_2$ correspond to the encoding functions and $g$ corresponds to the decoding function.
Here $R_1, R_2$ are nonnegative real numbers and $n$ is a positive integer.

Definition: For a particular distributed source code $\mathcal{C}_n(F)$, the probability of error is
defined as

\[ P_e^{(n)} = \Pr[g(f_1(\mathbf{X}), f_2(\mathbf{Y})) \neq F(\mathbf{X},\mathbf{Y})]. \tag{3} \]

Definition: A rate pair $(R_1, R_2)$ is said to be achievable for a function $F$ if there exists a
sequence of distributed source codes $\{\mathcal{C}_n(F) : n \in \mathbb{N}\}$ with corresponding
probabilities of error $P_e^{(n)}$ such that $P_e^{(n)} \to 0$ as $n \to \infty$.

Definition: For a particular function $F$, the achievable rate region $\mathcal{R}(F)$ is the
closure of the set of all achievable rate pairs.

3 Main Result 

The following is the main result of this correspondence. 
Theorem: The achievable rate region for a function $F$ of correlated random variables
$(X,Y)$ is given by

\[ \mathcal{R}(F) = \{(R_1,R_2) : R_1 \ge H(F(X,Y)|Y),\ R_2 \ge H(F(X,Y)|X),\ R_1 + R_2 \ge H(F(X,Y))\}. \]
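The three entropies in the theorem statement can be evaluated directly for a small joint distribution. The following is a minimal sketch, assuming the joint pmf is given as a Python dict keyed by `(x, y)` and entropies are measured in bits; the helper names (`entropy`, `marginal`, `theorem_bounds`) and the example source are illustrative and not part of the original correspondence.

```python
from collections import defaultdict
from math import log2


def entropy(pmf):
    """Shannon entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)


def marginal(joint, key):
    """Push a joint pmf {(x, y): p} forward through key(x, y)."""
    out = defaultdict(float)
    for (x, y), p in joint.items():
        out[key(x, y)] += p
    return dict(out)


def theorem_bounds(joint, F):
    """Return (H(F|Y), H(F|X), H(F)) for a joint pmf {(x, y): p} and a function F."""
    h_z = entropy(marginal(joint, lambda x, y: F(x, y)))
    h_zy = entropy(marginal(joint, lambda x, y: (F(x, y), y)))
    h_zx = entropy(marginal(joint, lambda x, y: (F(x, y), x)))
    h_x = entropy(marginal(joint, lambda x, y: x))
    h_y = entropy(marginal(joint, lambda x, y: y))
    # Chain rule: H(Z|Y) = H(Z, Y) - H(Y), and likewise with X in place of Y.
    return h_zy - h_y, h_zx - h_x, h_z


# Hypothetical example: independent uniform bits with F = logical AND.
uniform = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
r1_min, r2_min, sum_min = theorem_bounds(uniform, lambda x, y: x & y)
```

For this example the three quantities are 0.5, 0.5 and roughly 0.811 bits. Passing `lambda x, y: (x, y)` as `F` returns $H(X|Y)$, $H(Y|X)$ and $H(X,Y)$, the quantities that define the Slepian-Wolf region.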

The proof of this result is a simple application of the techniques used to prove the
Slepian-Wolf theorem in [4]. So we shamelessly adopt the conventions and notation in
[4, Chapter 14], if for no other reason than to illustrate the simplicity of the proof. We need
to borrow the following notation¹ before we proceed with the proof.

Let $(U_1, U_2, \ldots, U_k)$ be a finite collection of discrete random variables with a fixed joint
distribution $p(u_1,u_2,\ldots,u_k)$, $(u_1,u_2,\ldots,u_k) \in \mathcal{U}_1 \times \mathcal{U}_2 \times \cdots \times \mathcal{U}_k$.
The set of $\epsilon$-typical $n$-sequences will be denoted by $A_\epsilon^{(n)}(U_1, U_2, \ldots, U_k)$. We will
denote the set of $U_i$ $n$-sequences that are jointly $\epsilon$-typical with a particular $U_j$
$n$-sequence, $\mathbf{u}_j$, by $A_\epsilon^{(n)}(U_i|\mathbf{u}_j)$.
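As a rough illustration of the typical-set notation, the sketch below checks typicality for a single i.i.d. variable only, using the usual weak-typicality condition that the normalized log-likelihood is within $\epsilon$ of the entropy. This is an assumption made for illustration: the joint definition used in the proof imposes the analogous condition on every subset of the variables, and the function name `is_typical` is hypothetical.

```python
from math import log2


def is_typical(seq, pmf, eps):
    """Check whether seq is weakly eps-typical under the i.i.d. pmf (entropy in bits)."""
    n = len(seq)
    if any(pmf.get(s, 0.0) == 0.0 for s in seq):
        return False  # zero-probability sequences are never typical
    entropy = -sum(p * log2(p) for p in pmf.values() if p > 0)
    # Normalized negative log-likelihood of the observed sequence.
    empirical = -sum(log2(pmf[s]) for s in seq) / n
    return abs(empirical - entropy) < eps


# Hypothetical Bernoulli(0.1) source with block length 10.
pmf = {0: 0.9, 1: 0.1}
one_in_ten = (1,) + (0,) * 9   # empirical frequencies match the pmf
all_zeros = (0,) * 10          # most likely sequence, yet not typical for eps = 0.1
```

With `eps = 0.1`, `is_typical(one_in_ten, pmf, 0.1)` holds while `is_typical(all_zeros, pmf, 0.1)` does not, which shows that typicality is about matching the entropy rate rather than about being the most probable sequence.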

Proof of Achievability: For each $\mathbf{x} \in \mathcal{X}^n$, set $f_1(\mathbf{x})$ to a value chosen from
the set $\{1, 2, \ldots, 2^{nR_1}\}$ according to a uniform distribution. Similarly, for each
$\mathbf{y} \in \mathcal{Y}^n$, set $f_2(\mathbf{y})$ to a value chosen from the set $\{1, 2, \ldots, 2^{nR_2}\}$
according to a uniform distribution. The encoding functions are revealed to the corresponding
encoder and the decoder, i.e., the decoder needs to know both $f_1$ and $f_2$ while encoder $i$
needs to know only $f_i$, $i = 1, 2$.
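The random assignment of encoder outputs described above can be sketched for a toy block length as follows. This is only an illustration under stated assumptions: the function name `random_binning` is hypothetical, the number of bins is rounded up to an integer (the proof treats $2^{nR}$ as an integer), and enumerating all sequences is feasible only for very small $n$.

```python
import itertools
import math
import random


def random_binning(alphabet, n, rate, rng):
    """Assign every length-n sequence over alphabet a uniformly random bin index."""
    num_bins = max(1, math.ceil(2 ** (n * rate)))  # rounding up is an assumption
    return {
        seq: rng.randint(1, num_bins)  # indices 1..num_bins, both ends inclusive
        for seq in itertools.product(alphabet, repeat=n)
    }


# A fixed seed stands in for revealing the same table to the encoder and the decoder.
rng = random.Random(0)
f1 = random_binning((0, 1), n=4, rate=0.5, rng=rng)
```

Here `f1` maps each of the 16 binary sequences of length 4 to one of 4 bins; a second call with the other source alphabet and rate would give the table for $f_2$.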

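The encoding operation consists of encoder 1 and encoder 2 sending the values of $f_1(\mathbf{X})$
and $f_2(\mathbf{Y})$, respectively, to the decoder. Given the encoder outputs
$(f_1(\mathbf{X}), f_2(\mathbf{Y})) = (i_0, j_0)$, the decoder outputs its estimate of $F(\mathbf{X},\mathbf{Y})$,
$\hat{\mathbf{Z}}$, to be $\mathbf{z}$ if there exists a unique sequence $\mathbf{z} \in \mathcal{Z}^n$ such that
$(\mathbf{z},\mathbf{x},\mathbf{y}) \in A_\epsilon^{(n)}(Z,X,Y)$ for some $(\mathbf{x},\mathbf{y}) \in \mathcal{X}^n \times \mathcal{Y}^n$
such that $f_1(\mathbf{x}) = i_0$ and $f_2(\mathbf{y}) = j_0$. Note that the pair $(\mathbf{x},\mathbf{y})$ need not
be unique.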

The decoder operation is where the current coding scheme differs from the Slepian-Wolf coding
scheme. Of course, if $F$ is the identity function, i.e., $F(x,y) = (x,y)$,
$\forall (x,y) \in \mathcal{X} \times \mathcal{Y}$, then the above decoder coincides with the decoder in the
Slepian-Wolf coding scheme.

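We now proceed with the analysis of the probability of error averaged over all possible
encoder choices $f_1, f_2$. Let $E = \{\hat{\mathbf{Z}} \neq \mathbf{Z}\}$ denote the decoding error event. Then
we have $E = E_0 \cup E_1 \cup E_2 \cup E_{12}$ where

\[
\begin{aligned}
E_0 &= \{\nexists\, \mathbf{z} \in \mathcal{Z}^n : (\mathbf{z},\mathbf{x}',\mathbf{y}') \in A_\epsilon^{(n)}(Z,X,Y) \text{ for some } (\mathbf{x}',\mathbf{y}') \text{ such that } f_1(\mathbf{x}') = f_1(\mathbf{X}),\ f_2(\mathbf{y}') = f_2(\mathbf{Y})\}, \\
E_1 &= \{\exists\, \mathbf{z} \in \mathcal{Z}^n : (\mathbf{z},\mathbf{x}',\mathbf{Y}) \in A_\epsilon^{(n)}(Z,X,Y) \text{ for some } \mathbf{x}' \text{ such that } f_1(\mathbf{x}') = f_1(\mathbf{X}),\ \mathbf{z} = F(\mathbf{x}',\mathbf{Y}) \neq F(\mathbf{X},\mathbf{Y})\}, \\
E_2 &= \{\exists\, \mathbf{z} \in \mathcal{Z}^n : (\mathbf{z},\mathbf{X},\mathbf{y}') \in A_\epsilon^{(n)}(Z,X,Y) \text{ for some } \mathbf{y}' \text{ such that } f_2(\mathbf{y}') = f_2(\mathbf{Y}),\ \mathbf{z} = F(\mathbf{X},\mathbf{y}') \neq F(\mathbf{X},\mathbf{Y})\}, \\
E_{12} &= \{\exists\, \mathbf{z} \in \mathcal{Z}^n : (\mathbf{z},\mathbf{x}',\mathbf{y}') \in A_\epsilon^{(n)}(Z,X,Y) \text{ for some } (\mathbf{x}',\mathbf{y}') \text{ such that } f_1(\mathbf{x}') = f_1(\mathbf{X}),\ f_2(\mathbf{y}') = f_2(\mathbf{Y}), \\
&\qquad \mathbf{z} = F(\mathbf{x}',\mathbf{y}') \neq F(\mathbf{X},\mathbf{Y})\}.
\end{aligned}
\]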

¹See [4, Section 14.2] for definitions and properties.

From the definition of jointly typical sequences it is easy to see that

\[ \Pr[E_0] \le \Pr[(\mathbf{z},\mathbf{x}',\mathbf{y}') \in \mathcal{Z}^n \times \mathcal{X}^n \times \mathcal{Y}^n : (\mathbf{z},\mathbf{x}',\mathbf{y}') \notin A_\epsilon^{(n)}(Z,X,Y)] < \epsilon, \tag{4} \]

for sufficiently large $n$. We bound $\Pr[E_1]$ in the following manner.

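\[
\begin{aligned}
\Pr[E_1]
&= \Pr[\exists\, \mathbf{z} \in \mathcal{Z}^n : (\mathbf{z},\mathbf{x}',\mathbf{Y}) \in A_\epsilon^{(n)}(Z,X,Y) \text{ for some } \mathbf{x}' \text{ such that } f_1(\mathbf{x}') = f_1(\mathbf{X}),\ \mathbf{z} = F(\mathbf{x}',\mathbf{Y}) \neq F(\mathbf{X},\mathbf{Y})] \\
&\overset{(a)}{\le} \Pr[\exists\, \mathbf{z} \in \mathcal{Z}^n : (\mathbf{z},\mathbf{Y}) \in A_\epsilon^{(n)}(Z,Y) \text{ for some } \mathbf{x}' \text{ such that } f_1(\mathbf{x}') = f_1(\mathbf{X}),\ \mathbf{z} = F(\mathbf{x}',\mathbf{Y}) \neq F(\mathbf{X},\mathbf{Y})] \\
&= \sum_{\mathbf{x},\mathbf{y}} p_{\mathbf{X}\mathbf{Y}}(\mathbf{x},\mathbf{y}) \Pr[\exists\, \mathbf{z} \in \mathcal{Z}^n : (\mathbf{z},\mathbf{y}) \in A_\epsilon^{(n)}(Z,Y) \text{ for some } \mathbf{x}' \text{ such that } f_1(\mathbf{x}') = f_1(\mathbf{x}),\ \mathbf{z} = F(\mathbf{x}',\mathbf{y}) \neq F(\mathbf{x},\mathbf{y})] \\
&\le \sum_{\mathbf{x},\mathbf{y}} p_{\mathbf{X}\mathbf{Y}}(\mathbf{x},\mathbf{y}) \Pr[(\mathbf{z},\mathbf{y}) \in A_\epsilon^{(n)}(Z,Y) : \text{for some } \mathbf{x}' \neq \mathbf{x},\ f_1(\mathbf{x}') = f_1(\mathbf{x})] \\
&\overset{(b)}{\le} \sum_{\mathbf{x},\mathbf{y}} p_{\mathbf{X}\mathbf{Y}}(\mathbf{x},\mathbf{y})\, 2^{-nR_1} |A_\epsilon^{(n)}(Z|\mathbf{y})| \\
&\overset{(c)}{\le} 2^{-nR_1}\, 2^{n(H(Z|Y)+2\epsilon)},
\end{aligned}
\]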

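where

(a) follows from the fact that for any $(\mathbf{z},\mathbf{x}',\mathbf{y}) \in \mathcal{Z}^n \times \mathcal{X}^n \times \mathcal{Y}^n$,
$(\mathbf{z},\mathbf{x}',\mathbf{y}) \in A_\epsilon^{(n)}(Z,X,Y) \Rightarrow (\mathbf{z},\mathbf{y}) \in A_\epsilon^{(n)}(Z,Y)$,

(b) follows from the fact that we are averaging over all possible encoder choices for $f_1$ and
the property that for a fixed $\mathbf{y} \in \mathcal{Y}^n$,
$|\{\mathbf{z} \in \mathcal{Z}^n : (\mathbf{z},\mathbf{y}) \in A_\epsilon^{(n)}(Z,Y)\}| = |A_\epsilon^{(n)}(Z|\mathbf{y})|$,

(c) follows from the fact that $|A_\epsilon^{(n)}(Z|\mathbf{y})| \le 2^{n(H(Z|Y)+2\epsilon)}$ [4, Theorem 14.2.2].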

The final bound on $\Pr[E_1]$ tends to zero as $n \to \infty$ if $R_1 > H(Z|Y) + 2\epsilon$. Thus for
sufficiently large $n$, $\Pr[E_1] < \epsilon$. Similarly, we can show that $\Pr[E_2] < \epsilon$ for
sufficiently large $n$ if $R_2 > H(Z|X) + 2\epsilon$.
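To get a feel for how quickly this bound decays, the expression $2^{-n(R_1 - H(Z|Y) - 2\epsilon)}$ implied by steps (b) and (c) can be evaluated numerically. The sketch below is illustrative only: the function names and the example numbers are assumptions, and it evaluates the expression itself without checking that $n$ is large enough for the typical-set cardinality bound to hold.

```python
def e1_bound(n, r1, h_z_given_y, eps):
    """Evaluate 2^{-n (R1 - H(Z|Y) - 2 eps)} for the given block length."""
    return 2.0 ** (-n * (r1 - h_z_given_y - 2 * eps))


def min_blocklength(r1, h_z_given_y, eps, n_max=10_000):
    """Smallest n <= n_max with e1_bound < eps, or None if no such n is found."""
    if r1 - h_z_given_y - 2 * eps <= 0:
        return None  # the exponent is not negative, so the bound never decays
    for n in range(1, n_max + 1):
        if e1_bound(n, r1, h_z_given_y, eps) < eps:
            return n
    return None


# Hypothetical numbers: R1 = 1 bit, H(Z|Y) = 0.5 bit, eps = 0.1.
n_star = min_blocklength(1.0, 0.5, 0.1)
```

For these example values the exponent is $-0.3n$ and the expression first drops below $\epsilon = 0.1$ at `n_star = 12`. When the rate margin $R_1 - H(Z|Y) - 2\epsilon$ is not positive, the function returns `None`.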

Note that $E_1 \subset E_{12}$ and $E_2 \subset E_{12}$. It then follows that
$E = E_0 \cup E_1 \cup E_2 \cup E_{12} = E_0 \cup E_1 \cup E_2 \cup (E_{12} \cap E_1^c \cap E_2^c)$. We will find it
easier to bound $E_{12} \cap E_1^c \cap E_2^c$ rather than bound $E_{12}$ directly. We bound
$\Pr[E_{12} \cap E_1^c \cap E_2^c]$ in the following manner.

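\[
\begin{aligned}
\Pr[E_{12} \cap E_1^c \cap E_2^c]
&= \Pr[\exists\, \mathbf{z} \in \mathcal{Z}^n : (\mathbf{z},\mathbf{x}',\mathbf{y}') \in A_\epsilon^{(n)}(Z,X,Y) \text{ for some } \mathbf{x}' \neq \mathbf{X},\ \mathbf{y}' \neq \mathbf{Y} \text{ such that } f_1(\mathbf{x}') = f_1(\mathbf{X}), \\
&\qquad f_2(\mathbf{y}') = f_2(\mathbf{Y}),\ \mathbf{z} = F(\mathbf{x}',\mathbf{y}') \neq F(\mathbf{X},\mathbf{Y})] \\
&\overset{(a)}{\le} \Pr[\exists\, \mathbf{z} \in \mathcal{Z}^n : \mathbf{z} \in A_\epsilon^{(n)}(Z) \text{ for some } \mathbf{x}' \neq \mathbf{X},\ \mathbf{y}' \neq \mathbf{Y} \text{ such that } f_1(\mathbf{x}') = f_1(\mathbf{X}),\ f_2(\mathbf{y}') = f_2(\mathbf{Y}), \\
&\qquad \mathbf{z} = F(\mathbf{x}',\mathbf{y}') \neq F(\mathbf{X},\mathbf{Y})] \\
&= \sum_{\mathbf{x},\mathbf{y}} p_{\mathbf{X}\mathbf{Y}}(\mathbf{x},\mathbf{y}) \Pr[\exists\, \mathbf{z} \in \mathcal{Z}^n : \mathbf{z} \in A_\epsilon^{(n)}(Z) \text{ for some } \mathbf{x}' \neq \mathbf{x},\ \mathbf{y}' \neq \mathbf{y} \text{ such that } f_1(\mathbf{x}') = f_1(\mathbf{x}),\ f_2(\mathbf{y}') = f_2(\mathbf{y}), \\
&\qquad \mathbf{z} = F(\mathbf{x}',\mathbf{y}') \neq F(\mathbf{x},\mathbf{y})] \\
&\le \sum_{\mathbf{x},\mathbf{y}} p_{\mathbf{X}\mathbf{Y}}(\mathbf{x},\mathbf{y}) \Pr[\mathbf{z} \in A_\epsilon^{(n)}(Z) : \text{for some } \mathbf{x}' \neq \mathbf{x},\ \mathbf{y}' \neq \mathbf{y},\ f_1(\mathbf{x}') = f_1(\mathbf{x}),\ f_2(\mathbf{y}') = f_2(\mathbf{y})] \\
&\overset{(b)}{\le} \sum_{\mathbf{x},\mathbf{y}} p_{\mathbf{X}\mathbf{Y}}(\mathbf{x},\mathbf{y})\, 2^{-nR_1}\, 2^{-nR_2}\, |A_\epsilon^{(n)}(Z)| \\
&\overset{(c)}{\le} 2^{-n(R_1+R_2)}\, 2^{n(H(Z)+\epsilon)},
\end{aligned}
\]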
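where

(a) follows from the fact that for any $(\mathbf{z},\mathbf{x}',\mathbf{y}') \in \mathcal{Z}^n \times \mathcal{X}^n \times \mathcal{Y}^n$,
$(\mathbf{z},\mathbf{x}',\mathbf{y}') \in A_\epsilon^{(n)}(Z,X,Y) \Rightarrow \mathbf{z} \in A_\epsilon^{(n)}(Z)$,

(b) follows from the fact that we are averaging over all possible encoder choices $f_1, f_2$ and
from the definition of $A_\epsilon^{(n)}(Z)$,

(c) follows from the fact that $|A_\epsilon^{(n)}(Z)| \le 2^{n(H(Z)+\epsilon)}$.

The final bound on $\Pr[E_{12} \cap E_1^c \cap E_2^c]$ can be made smaller than $\epsilon$ for sufficiently
large $n$ if $R_1 + R_2 > H(Z) + \epsilon$.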

Thus, we have $\Pr[E] \le \Pr[E_0] + \Pr[E_1] + \Pr[E_2] + \Pr[E_{12} \cap E_1^c \cap E_2^c] < 4\epsilon$ for
sufficiently large $n$. Since the probability of error averaged over all codes is less than
$4\epsilon$, there exists at least one code $\mathcal{C}_n^*(F)$ for which the average probability of error
is less than $4\epsilon$. Since $\epsilon$ was arbitrary, we can construct a sequence of codes such that
$P_e^{(n)} \to 0$ as $n \to \infty$. The arbitrary choice of $\epsilon$ also implies that any rate pair
$(R_1,R_2)$ satisfying $R_1 > H(F(X,Y)|Y)$, $R_2 > H(F(X,Y)|X)$, $R_1 + R_2 > H(F(X,Y))$ is
achievable. Since the achievable rate region is the closure of the set of all achievable rate
pairs, we have

\[ \mathcal{R}(F) \supseteq \{(R_1,R_2) : R_1 \ge H(F(X,Y)|Y),\ R_2 \ge H(F(X,Y)|X),\ R_1 + R_2 \ge H(F(X,Y))\}. \]

This completes the proof of the achievability. ■
Proof of Converse: This proof is once again very similar to the proof of the converse to
the Slepian-Wolf theorem [4, Section 14.4.2].

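Let $(R_1, R_2)$ be an achievable rate pair. By definition, there exists a sequence of distributed
source codes $\{\mathcal{C}_n(F) : n \in \mathbb{N}\}$ and hence a sequence of function triplets
$\{(f_1^{(n)}, f_2^{(n)}, g^{(n)}) : n \in \mathbb{N}\}$, with
$P_e^{(n)} = \Pr[g^{(n)}(f_1^{(n)}(\mathbf{X}), f_2^{(n)}(\mathbf{Y})) \neq F(\mathbf{X},\mathbf{Y})]$ such that
$P_e^{(n)} \to 0$ as $n \to \infty$.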

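For notational convenience, define $I_0^{(n)} = f_1^{(n)}(\mathbf{X})$ and $J_0^{(n)} = f_2^{(n)}(\mathbf{Y})$.
By Fano's inequality, we have

\[
\begin{aligned}
H(F(\mathbf{X},\mathbf{Y}) | I_0^{(n)}, J_0^{(n)}) &\le P_e^{(n)} \log|\mathcal{Z}^n| + 1 \\
&= P_e^{(n)}\, n \log|\mathcal{Z}| + 1 = n\delta_n,
\end{aligned} \tag{5}
\]

where $\delta_n = P_e^{(n)} \log|\mathcal{Z}| + 1/n$. We know that $\delta_n \to 0$ as $n \to \infty$. Since
conditioning reduces entropy, we also have

\[ H(F(\mathbf{X},\mathbf{Y}) | \mathbf{Y}, I_0^{(n)}, J_0^{(n)}) \le n\delta_n, \tag{6} \]

\[ H(F(\mathbf{X},\mathbf{Y}) | \mathbf{X}, I_0^{(n)}, J_0^{(n)}) \le n\delta_n. \tag{7} \]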
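The behaviour of $\delta_n$ can be checked numerically. The snippet below is a small sketch that assumes logarithms are taken to base 2 and reads $\delta_n$ off (5) as $P_e^{(n)} \log|\mathcal{Z}| + 1/n$; the function name `fano_delta` and the sample values are illustrative.

```python
from math import log2


def fano_delta(p_err, alphabet_size, n):
    """delta_n = P_e * log2|Z| + 1/n, so that n * delta_n equals the right side of (5)."""
    return p_err * log2(alphabet_size) + 1.0 / n


# Hypothetical sequence of codes whose error probability decays as 1/n.
deltas = [fano_delta(1.0 / n, alphabet_size=4, n=n) for n in (10, 100, 1000)]
```

For this hypothetical sequence each value equals $3/n$, so `deltas` shrinks toward zero as the block length grows, which is the property the converse relies on.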

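Following the notation in [4], we will write $U \to V \to W$ for some random variables
$U, V, W$ to mean that $U$ and $W$ are conditionally independent given $V$. For the problem under
consideration, we have the following relations,

\[
\begin{aligned}
(I_0^{(n)}, J_0^{(n)}) &\to (\mathbf{X},\mathbf{Y}) \to F(\mathbf{X},\mathbf{Y}), \\
I_0^{(n)} &\to (\mathbf{X},\mathbf{Y}) \to (F(\mathbf{X},\mathbf{Y}),\mathbf{Y}), \\
J_0^{(n)} &\to (\mathbf{X},\mathbf{Y}) \to (F(\mathbf{X},\mathbf{Y}),\mathbf{X}).
\end{aligned}
\]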

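Application of the data processing inequality to each of the above relations and simple
manipulations yield the following respective inequalities.

\[ H(I_0^{(n)}, J_0^{(n)} | \mathbf{X},\mathbf{Y}) \le H(I_0^{(n)}, J_0^{(n)} | F(\mathbf{X},\mathbf{Y})) \tag{8} \]
\[ H(I_0^{(n)} | \mathbf{X},\mathbf{Y}) \le H(I_0^{(n)} | F(\mathbf{X},\mathbf{Y}),\mathbf{Y}) \tag{9} \]
\[ H(J_0^{(n)} | \mathbf{X},\mathbf{Y}) \le H(J_0^{(n)} | F(\mathbf{X},\mathbf{Y}),\mathbf{X}) \tag{10} \]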
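Then we have a chain of inequalities

\[
\begin{aligned}
n(R_1 + R_2) &\ge H(I_0^{(n)}, J_0^{(n)}) \\
&= I(F(\mathbf{X},\mathbf{Y}); I_0^{(n)}, J_0^{(n)}) + H(I_0^{(n)}, J_0^{(n)} | F(\mathbf{X},\mathbf{Y})) \\
&\overset{(a)}{\ge} I(F(\mathbf{X},\mathbf{Y}); I_0^{(n)}, J_0^{(n)}) + H(I_0^{(n)}, J_0^{(n)} | \mathbf{X},\mathbf{Y}) \\
&\overset{(b)}{=} I(F(\mathbf{X},\mathbf{Y}); I_0^{(n)}, J_0^{(n)}) \\
&= H(F(\mathbf{X},\mathbf{Y})) - H(F(\mathbf{X},\mathbf{Y}) | I_0^{(n)}, J_0^{(n)}) \\
&\overset{(c)}{\ge} nH(F(X,Y)) - n\delta_n,
\end{aligned}
\]

where

(a) follows from (8),

(b) follows from the fact that $(I_0^{(n)}, J_0^{(n)})$ is a function of $(\mathbf{X},\mathbf{Y})$,

(c) follows from the chain rule and the fact that $F(\mathbf{X},\mathbf{Y})$ consists of i.i.d. components,
and from (5).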

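Similarly, we can write

\[
\begin{aligned}
nR_1 &\ge H(I_0^{(n)} | \mathbf{Y}) \\
&= I(F(\mathbf{X},\mathbf{Y}); I_0^{(n)} | \mathbf{Y}) + H(I_0^{(n)} | F(\mathbf{X},\mathbf{Y}),\mathbf{Y}) \\
&\overset{(a)}{\ge} I(F(\mathbf{X},\mathbf{Y}); I_0^{(n)} | \mathbf{Y}) + H(I_0^{(n)} | \mathbf{X},\mathbf{Y}) \\
&\overset{(b)}{=} I(F(\mathbf{X},\mathbf{Y}); I_0^{(n)} | \mathbf{Y}) \\
&= H(F(\mathbf{X},\mathbf{Y}) | \mathbf{Y}) - H(F(\mathbf{X},\mathbf{Y}) | \mathbf{Y}, I_0^{(n)}, J_0^{(n)}) \\
&\overset{(c)}{\ge} nH(F(X,Y)|Y) - n\delta_n,
\end{aligned}
\]

where

(a) follows from (9),

(b) follows from the fact that $I_0^{(n)}$ is a function of $\mathbf{X}$,

(c) follows from the chain rule and the fact that $H(F(X_i,Y_i)|Y_i) = H(F(X,Y)|Y)$ for
$i = 1,2,\ldots,n$, and from (6).

The equality preceding (c) also uses the fact that $J_0^{(n)}$ is a function of $\mathbf{Y}$, so adding
it to the conditioning does not change the conditional entropy.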

Using similar techniques, we also get $nR_2 \ge nH(F(X,Y)|X) - n\delta_n$ by using (10) and (7).
Thus, for any $n$, we have $R_1 \ge H(F(X,Y)|Y) - \delta_n$, $R_2 \ge H(F(X,Y)|X) - \delta_n$ and
$R_1 + R_2 \ge H(F(X,Y)) - \delta_n$. Since $\delta_n \to 0$ as $n \to \infty$, we have that any rate pair is
achievable only if $R_1 \ge H(F(X,Y)|Y)$, $R_2 \ge H(F(X,Y)|X)$ and $R_1 + R_2 \ge H(F(X,Y))$. Thus,

\[ \mathcal{R}(F) \subseteq \{(R_1,R_2) : R_1 \ge H(F(X,Y)|Y),\ R_2 \ge H(F(X,Y)|X),\ R_1 + R_2 \ge H(F(X,Y))\}. \]

This completes the proof of the converse. ■

4 Concluding Remarks 

We have found the exact achievable rate region for the problem of reliably recovering a function
of correlated sources by separate encoding of the sources. The proof turns out to be a simple
plug-and-play of the techniques in [4]. It is obvious that the achievable rate region found here
reduces to the Slepian-Wolf region when $F$ is the identity function. Although less obvious, it
is not difficult to see that the result derived in this correspondence conforms with the results
of [1], [2].